5 research outputs found

    Learning Multi-Object Positional Relationships via Emergent Communication

    The study of emergent communication has been dedicated to interactive artificial intelligence. While existing work focuses on communication about single objects or complex image scenes, we argue that communicating relationships between multiple objects is important in more realistic tasks but remains understudied. In this paper, we aim to fill this gap and focus on emergent communication about positional relationships between two objects. We train agents in a referential game where observations contain two objects, and find that generalization is the major problem when the positional relationship is involved. The key factor affecting the generalization ability of the emergent language is the input variation between Speaker and Listener, which is realized by a random image generator in our work. Further, we find that the learned language generalizes well in a new multi-step MDP task where the positional relationship describes the goal, and it performs better than raw-pixel images as well as pre-trained image features, verifying the strong generalization ability of discrete sequences. We also show that language transferred from the referential game performs better in the new task than language learned directly in this task, implying the potential benefits of pre-training in referential games. All in all, our experiments demonstrate the viability and merit of having agents learn to communicate positional relationships between multiple objects through emergent communication.
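
    As a rough illustration of the referential-game setup described above, the sketch below implements a toy two-object game: a Speaker encodes its own view of the target scene into a discrete message (via Gumbel-softmax), and a Listener, given an independently generated view, must pick the target among distractors. The module names, dimensions, and random stand-in features are illustrative assumptions, not the paper's architecture or code.

```python
# Minimal sketch of a two-object referential game with independently
# generated Speaker/Listener inputs (all sizes and names are illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MSG_LEN, N_DISTRACTORS, FEAT = 10, 4, 3, 32

class Speaker(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(FEAT, 64)
        self.out = nn.Linear(64, MSG_LEN * VOCAB)
    def forward(self, obs):
        h = torch.relu(self.enc(obs))
        logits = self.out(h).view(-1, MSG_LEN, VOCAB)
        # Gumbel-softmax yields a differentiable discrete message.
        return F.gumbel_softmax(logits, tau=1.0, hard=True)

class Listener(nn.Module):
    def __init__(self):
        super().__init__()
        self.msg_enc = nn.Linear(MSG_LEN * VOCAB, 64)
        self.obj_enc = nn.Linear(FEAT, 64)
    def forward(self, msg, candidates):
        m = self.msg_enc(msg.flatten(1))         # (B, 64)
        c = self.obj_enc(candidates)             # (B, K, 64)
        return torch.einsum('bd,bkd->bk', m, c)  # similarity scores

def play_round(speaker, listener, batch=16):
    # Speaker and Listener see independently generated views of the same
    # underlying scene; random features stand in for rendered images here.
    target_speaker_view = torch.randn(batch, FEAT)
    target_listener_view = torch.randn(batch, FEAT)
    distractors = torch.randn(batch, N_DISTRACTORS, FEAT)
    candidates = torch.cat([target_listener_view.unsqueeze(1), distractors], 1)
    msg = speaker(target_speaker_view)
    scores = listener(msg, candidates)
    # For brevity the target is kept at index 0; a real game would shuffle it.
    labels = torch.zeros(batch, dtype=torch.long)
    return F.cross_entropy(scores, labels)

loss = play_round(Speaker(), Listener())
loss.backward()
```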

    ImageManip: Image-based Robotic Manipulation with Affordance-guided Next View Selection

    In the realm of future home-assistant robots, 3D articulated object manipulation is essential for enabling robots to interact with their environment. Many existing studies use 3D point clouds as the primary input for manipulation policies. However, this approach encounters challenges due to data sparsity and the significant cost of acquiring point cloud data, which can limit its practicality. In contrast, RGB images offer high-resolution observations from cost-effective devices but lack spatial 3D geometric information. To overcome these limitations, we present a novel image-based robotic manipulation framework designed to capture multiple perspectives of the target object and infer depth information to complement its geometry. Initially, the system employs an eye-on-hand RGB camera to capture an overall view of the target object and predicts an initial depth map and a coarse affordance map. The affordance map indicates actionable areas on the object and serves as a constraint for selecting subsequent viewpoints. Based on this global visual prior, we adaptively identify the optimal next viewpoint for a detailed observation of the area where manipulation is likely to succeed. We leverage geometric consistency to fuse the views, resulting in a refined depth map and a more precise affordance map for robot manipulation decisions. By comparing with prior works that adopt point clouds or RGB images as inputs, we demonstrate the effectiveness and practicality of our method. On the project webpage (https://sites.google.com/view/imagemanip), real-world experiments further highlight the potential of our method for practical deployment.
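
    To make the pipeline above concrete, the sketch below walks through the affordance-guided next-view loop as described in the abstract: predict an initial depth and affordance map from the global RGB view, pick the next viewpoint that best covers the most actionable region, and fuse the views into refined predictions. Every function, data shape, and the scoring heuristic is a placeholder for illustration; this is not the authors' networks or released API.

```python
# Illustrative sketch of the affordance-guided multi-view pipeline.
# All networks and helpers below are stand-ins, not the ImageManip code.
import numpy as np

def predict_depth_and_affordance(rgb):
    """Stand-in for the learned networks: return a depth map and a
    per-pixel affordance (actionability) map for the given RGB view."""
    h, w, _ = rgb.shape
    return np.ones((h, w)), np.random.rand(h, w)

def select_next_viewpoint(affordance, candidate_poses):
    """Pick the candidate camera pose that best observes the most
    actionable region (toy score: proximity to the affordance peak)."""
    peak = np.unravel_index(np.argmax(affordance), affordance.shape)
    scores = [-np.linalg.norm(np.array(pose[:2]) - np.array(peak))
              for pose in candidate_poses]
    return candidate_poses[int(np.argmax(scores))]

def fuse_views(depths, affordances):
    """Toy geometric-consistency fusion: average aligned predictions."""
    return np.mean(depths, axis=0), np.mean(affordances, axis=0)

# 1. Capture a global view with the eye-on-hand RGB camera (stub image).
rgb0 = np.zeros((64, 64, 3))
depth0, aff0 = predict_depth_and_affordance(rgb0)

# 2. Choose the next viewpoint, constrained by the coarse affordance map.
next_pose = select_next_viewpoint(aff0, candidate_poses=[(10, 10, 0.3),
                                                         (40, 50, 0.3)])

# 3. Capture the detailed view (stub) and fuse for refined predictions.
rgb1 = np.zeros((64, 64, 3))
depth1, aff1 = predict_depth_and_affordance(rgb1)
refined_depth, refined_aff = fuse_views([depth0, depth1], [aff0, aff1])
```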

    A comparative analysis of aerosol microphysical, optical and radiative properties during the Spring Festival holiday over Beijing and surrounding regions

    Using ground-based data, meteorological observations, and atmospheric environmental monitoring data, a comparative analysis of the microphysical and optical properties and radiative forcing of aerosols was conducted at three stations in differently developed environments during a severe air pollution episode over Beijing during the Spring Festival. During the most polluted period, the daily peak values of the aerosol optical depth were ~1.62, ~1.73, and ~0.74, about 2.6, 2.9, and 2.1 times higher than the background levels at the CAMS, Xianghe, and Shangdianzi sites, respectively. The daily peak values of the single scattering albedo were ~0.95, ~0.96, and ~0.87. The volume of fine-mode particles varied from 0.04 to 0.21 µm³ µm⁻², 0.06 to 0.17 µm³ µm⁻², and 0.01 to 0.10 µm³ µm⁻², about 0.3 to 5.8, 1.1 to 4.7, and 1.2 to 8.9 times greater than the background values, respectively. The daily absorption aerosol optical depth was ~0.01 to ~0.13 at CAMS, ~0.03 to ~0.14 at Xianghe, and ~0.01 to ~0.09 at Shangdianzi, and the absorption Ångström exponents reflected a significant increase in organic aerosols over CAMS and Xianghe and in black carbon over Shangdianzi. Aerosol radiative forcing at the bottom of the atmosphere varied from −20 to −130, −40 to −150, and −10 to −110 W m⁻² over the whole holiday period, indicating a cooling effect. Potential source contribution function and concentration-weighted trajectory analyses showed that Beijing, the southern parts of Hebei and Shanxi, and the central-northern part of Shandong contributed greatly to the pollution.
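
    For readers unfamiliar with the trajectory-statistics methods mentioned above, the sketch below shows a minimal potential source contribution function (PSCF) calculation, where each grid cell's value is the fraction of back-trajectory endpoints in that cell associated with receptor concentrations above a threshold (PSCF_ij = m_ij / n_ij). The grid resolution, threshold, and sample numbers are illustrative assumptions, not values or code from the study.

```python
# Minimal PSCF sketch: grid resolution, threshold, and inputs are illustrative.
import numpy as np

def pscf(lats, lons, concentrations, threshold, cell=0.5):
    """PSCF_ij = m_ij / n_ij, where n_ij counts all back-trajectory
    endpoints falling in grid cell (i, j) and m_ij counts only endpoints
    whose associated receptor concentration exceeds `threshold`."""
    i = np.floor(np.asarray(lats) / cell).astype(int)
    j = np.floor(np.asarray(lons) / cell).astype(int)
    n, m = {}, {}
    for ii, jj, c in zip(i, j, concentrations):
        n[(ii, jj)] = n.get((ii, jj), 0) + 1
        if c > threshold:
            m[(ii, jj)] = m.get((ii, jj), 0) + 1
    return {cell_ij: m.get(cell_ij, 0) / total for cell_ij, total in n.items()}

# Example: three trajectory endpoints, two tied to a polluted receptor sample.
print(pscf(lats=[39.9, 39.9, 40.4], lons=[116.4, 116.4, 117.0],
           concentrations=[180.0, 180.0, 40.0], threshold=75.0))
```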

    Cu(II)/Proline-Catalyzed Reductive Coupling of Sulfuryl Chloride and P(O)–H for P–S–C Bond Formation

    A considerably improved method for the Cu-catalyzed coupling of sulfuryl chloride with P(O)–H compounds is described. Using commercially available L-proline as the ligand decreased the precatalyst loading, broadened the substrate scope, and greatly improved the efficiency of the coupling reaction. Moreover, this transformation features gram-scale preparation and an easy-to-handle, recyclable catalyst.